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Abstract: An interesting statistical problem is to find regions where some studied process exceeds 
a certain level. Estimating such regions so that the probability for exceeding the level in the entire 
set is equal to some predefined value is a difficult problem that occurs in several areas of applications 
ranging from brain imaging to astrophysics. In this work, a method for solving this problem, as 
well as the related problem of finding uncertainty regions for contour curves, for latent Gaussian 
models is proposed. The method is based on using a parametric family for the excursion sets in 
combination with a sequential importance sampling method for estimating joint probabilities. The 
accuracy of the method is investigated using simulated data and two environmental applications are 
presented. In the first application, areas where the air pollution in the Piemonte region in northern 
Italy exceeds the daily limit value, set by the European Union for human health protection, are 
estimated. In the second application, regions in the African Sahel that experienced an increase in 
vegetation after the drought period in the early 1980s are estimated. 
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In many statistical applications, one is interested in finding areas where the studied process exceeds 
a certain level or is significantly different from some reference level. A typical example is in studies 
of air pollution, where one is interested in test ing if, and where, the pollution level exceeds some 
given limit value set by some regulatory agency [Cameletti et all I2012II. and similar examples ca n be 
found in a w ide range of s cienti fic fields including brain imaging T^chiui and Presamsl . B and 



astrophysics |f3eakv et al. . 19921 ] . In spatio-temporal applications one might be interested in finding 



regions that have experienced significant changes over the studied time period. This is a common 
probl em in climate science and the s tudied quantity can for example be t emperature iFurrer et al 



2007], precipitation [Sain et all 120111 ]. or vegetation [Eklundh and Olssonl . 120031 . iBolin et all 12009], 

The quintessential problem is that one has observations y of some latent stochastic field x(s) and 
wants to find a region D such that, with a certain given probability, x(s) > u for all s £ D for a given 
level u. The easiest, and most common, way of specifying D is to choose it as the set of locations 



D m = {s : P(x(s) > u) >1 — a}, 



(1) 



where the probability is taken under the posterior distribution for x\y. Thus, D m is specified as the 
set of locations where the marginal probability for exceeding the level exceeds some given value 1 — a. 
The set can be calculated using pointwise hypothesis testing and the parameter a then acts as the type 
1 error parameter, and thus controls how confident one is that the level is exceeded at each location. 
The problem with this definition of D is that of multiple hypothesis testing; the confidence level a 
does not give us any information about the family-wise error rate, and hence does not quantify the 
certainty of the level being exceeded at all points in the set simultaneously. That is, the probability 
P(x(s) > u, s £ D m ) is in general smaller than 1 — a. If one instead wants to choose D so that this 
simultaneous probability is 1 — a, one has to modify the procedure for constructing the set. 

The more general problem of multiple hypothesis testing is an active research area and there exists 
a number of proposed solutions for problems in various contexts. Most of these solutions are based 
on first calculating the marginal probabilities P(x(s) > u), then calculating a single threshold value 
t, and finally defining the exceedance region as D = {s : P(x(s) > u) > t}. The methods differ in 



1 



how the threshold t is calculated, and can basically be divided into three main categories; type 1 
error control thresholding, fa lse discovery rate thresholding, and posterior proba bility t hresh olding 
WchnnandPresaml fa]. The most popular method is likely the method by lAdlerl [198l| using 



the Euler characteristic of the latent field to control the family-wise error rate when defining the 
threshold t. Though this method is simple to use, one has to be careful to check whether the required 
assumptions are satisfied. Typically the method is accurate for large values of it, and the latent field 
is assumed to be stationary. 

In this work we will focus on the problem where the latent spatial field x(s) is Gaussian and 
measured at a set of irregular locations. This means that the posterior distribution ir(x\y) is non- 
stationary and typically non-Gaussian unless the measurements are Gaussian and the model param- 
eters are known a priori. One of our motivating examples is the problem of finding r e gions with 
significant c hange s in vegetation in the African Sahel studied by lEklundh and Qlssonl [2003l | and 



Bolin et al.l [2009], and in this example the threshold u is zero, so methods based on asymptotic 
arguments when u goes to infinity are unlikely to perform well. 

The method derived here is based on using a parametric family for the excursion sets in combina- 
tion with a sequential importance sampling method for estimating joint probabilities. For a specific 
choice of the parametric family, the method is equivalent to the thresholding methods mentioned 
above, with the important difference that the correct joint distribution is used when selecting the 
threshold. The method is extended using more general parametric families, and the related problem 
of finding uncertainty regions for contour curves is treated using the same methodology. 

The structure of the article is as follows. In Section [2j the problem is formulated and definitions 
for excursion sets and uncertainty regions for contour curves are given. In Section [3l a method for 
estimating these sets is proposed. Estimating the sets is the most difficult problem as one easily 
runs into computational difficulties arising from having to evaluate high-dimensional integrals. In 
Section U the methods are tested on a few simulated examples to test the method's accuracy. Two 
applications to real data are covered in Section [5l the first considers air pollution data from the 
North-Italian region Piemonte, and the second considers estimation of spatially dependent vegetation 
trends in the African Sahel. Finally, a few remarks and comments are given in Section [6l 



2 Problem formulation 

There are a number of different ways one could formulate excursion sets, and not all of them are 
useful from a practical point of view. Hence, in this section we will formalise the problem and discuss 
how the results should be interpreted. More precisely we look at two connected problems. The first 
one is to find areas where a stochastic process exceeds a given level with some probability and the 
second one is to quantify the uncertainty in contour curves of stochastic fields. 

Throughout this section, let f! be a bounded domain of R n , or have a well-defined area < oo. 
First some notation for excursion sets of a function and contour sets is needed. 

Definition 2.1 (Excursion sets for functions). Given a function /(s), s E $7, the positive excursion 
set for a level u is given by 

A+(f) = { s en-J(s)>u}. 

Similarly 

A-(f) = {sen ] f(s)<u}. 

is the negative excursion set. 

In a similar fashion one could now define the set of contour points for the level u as the set of 
points s for which /(s) = u; however, a contour curve consists not only of these points but also 
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discontinuous crossings of the level u. In order to incorporate both continuous and discontinuous 
crossings, a contour point is defined as a point s such that in every neighborhood B of s 

3si,s 2 £ B : /(si) > u,/(s 2 ) < it. 

The set of all such points is the complement of the union of the interior sets of the positive and 
negative excursion sets. 

Definition 2.2 (Contour sets for functions). Given a function /(s), s € f2, the contour set A° u for a 
level u is given by 

Ai(f) = (A+uyuA-(fry. 

where A° is the interior, relative to Q, of the set A and A c is the complement. 

Remark 1. Taking the interiors of the sets A£(f) and A~(f) is important. Consider for example the 
following function on = [0, 1] 

f-i o< s <o.5 

[1 0.5 < 8 < 1. 

In this case Aq(/)UA^"(/) = tt, so without taking the interiors of the sets Aq(J) would be empty when 
we want to include the discontinuous crossing at 0.5 in the contour set. It is also important to take 
the interiors with respect to and not R, since the endpoints and 1 always would be included in the 
contour set otherwise. This may seem as only a theoretical nicety, but the problem with discontinuous 
functions occurs frequently in environmental applications when discontinuous covariates are used for 
the mean value function of the field. This makes it essential to not treat contour sets as regions where 
the function lies close to a level, but rather as regions where level crossings occur. 

The statistical problem is now to find a region D such that the function x(s) exceeds the level 
u with a certain probability 1 — a for all s € D. There might be many such regions, so if one is 
interested in a single answer one might look for the largest of these. 

Definition 2.3 (Excursion sets). Let x(s), s € O be a random field (or process). The positive level 
u excursion set with probability 1 — a is given by 

E+ a {x) = argmax{[£»| : P(D C A+{x)) > 1 - a). 

D 

Similarly 

E~ a {x) = argmax{|D| : P(D C A~(x)) > 1 - a}. 

D 

is the negative level u excursion set with probability 1 — a. 

Remark 2. The set E^ a {x) can also be formulated as the largest set D for which P(inf sg £> x(s) < 
u) < a, which can be useful when calculating the set in practice. Also note that for deterministic 
functions / one has E+ a (f) = A+(f) and E~ a (f) = A~(f) for any a € [0, 1]. 

It is important to realize how the excursion set E+ a (x) should be interpreted: It is the largest 
set so that the level u is exceeded at all locations in the set with probability 1 — a, and therefore is 
a smaller set than D m defined in JT]), which is the set of points where the marginal probability for 
exceeding the level is at least 1 — a. Another possible definition of an excursion set would be a set 
that contains all excursions with probability 1 — a. This is a larger set than D m , given by E~ a (x) c . 
Which set one is interested in depends on the application, but it can be a good idea to calculate both 
to get a better understanding of the uncertainties in the problem. 

In certain applications, one might be interested in joint positive and negative excursions from 
some level, for example when doing simultaneous regressions and one is interested in finding regions 
where the slopes are significantly different from zero (see Section 15.21 for a possible scenario of this 
kind). 
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Definition 2.4 (Level avoiding sets). Let x(s), s € SI be a random field. The pair of level u avoiding 
sets with probability 1 — a is given by 

(M+ a (x),M- a (x)) = argmax{|L»- U D+\ : P(ZT C ^(x), C > 1 - a}. 

(£>+,£»-) 

Denote the union of these two sets the level avoiding set il/f^Q,: 

M Uja {x) = M+ a (x) U M~ a {x). 

Remark 3. The sets M+ a (x) and M~ a (x) must be non-overlapping for the probability to be non-zero. 
The set M u , a can be calculated as an excursion set itself. To see this, define a new random process 
2/(s) by 



y( s ) 



u — x(s), s € D 
x(s) —u, s D~ 



The probability calculation in Definition 12.41 can now be reformulated as an ordinary excursion prob- 
ability in y: 

P(D- C A-(x), D + C = P(D- Ufl+C A+(y)). 

Also in this case, the set can be found using a reformulation using the infimum over the region as 
P(inf se D-uD+ y(s) < 0) < a. 

Similarly to how the contour sets for deterministic functions were defined, the pair of level avoiding 
sets can now be used to define uncertainty regions for contour curves. 

Definition 2.5 (Uncertainty region for contour sets). Let x(s), s S Q be a random field, and let 
(M+ a (x), M~ a {x)) be the pair of level avoiding sets from Definition 12.41 The set 

M c u , a (x) = (M+ Q (x)°UM- Q (x)°) c 
is then an uncertainty region for the contour set of level u. 

The interpretation of this uncertainty region is important. The set M£ a is the smallest set such 
that all level u crossings of x are in the set with probability 1 — a. One should note that this definition 
of the un certainty region for level curv es is different from some other definitions in the literature. For 
example, iLindgren and Rvchlikl [1995l | define uncertainty regions as a union of intervals where each 
interval contains a single level crossing with probability 1 — a. 

It is somewhat unsatisfactory that the sets defined here are made unique by finding the largest set 
satisfying a certain restriction. The set E^ a {x) is for example defined as the largest set D satisfying 
P{D C A£(x)) > 1 — a, but there are also many other smaller sets satisfying the requirement, and 
these are not seen if only E+ a (x) is reported. Also, if one wants to know where the field likely exceeds 
the level u, the set E^ a {x) might not be sufficient since it does not provide any information about the 
locations not contained in the set. Therefore, it would be good to have something similar to p-values, 
i.e. the marginal probabilities of exceeding the level, but which can be interpreted simultaneously. 
To that end we introduce the excursion function, level avoidance function, and contour function as 
visual tools for answering such questions. 

Definition 2.6 (Excursion functions). The positive and negative u excursion functions are given by 

F+(s) = sup{l - a;s € E+ a }, 
F~(s) = sup{l - a;s G £~ Q }. 

Similarly, the level avoidance and contour functions are given by 

F u (s) = sup{l - a; s G M UjQ }, 
F u c (s) = 1-F u (s). 
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These functions take values between zero and one, and for a fixed level u and a fixed location s, 
this value is equal to 1 — a for the smallest a such that the location is a member of the excursion set. 
Thus, the set E' can be retrieved as the 1 — a excursion set of the function F'(s). The interpretation 
of the excursion function is therefore that if, for a given location s, the function takes a value close 
to one, this indicates that this location is a member of the excursion sets for almost all values of a, 
whereas if the value of the function is close to zero, the location is only a member of excursion sets 
with large values of a and it is therefore more unlikely that the process exceeds the level at that 
location. 



3 Computations 

So far, no assumptions have been made regarding the distribution of x(s), but to be able to calculate 
the excursion sets in practice we will now restrict ourselves to the clas s of latent Gauss ian models, 



which is a popular model class with many practical applications [see e.g. lRue et all l2009f | . Thus, the 



following problem setup is assumed. Let x(s) be a random field that can be written on the form 

k 

x(s) = J2A/<(s) + z(s) 
1=1 

where fi(s) are fixed effects and z(s) is a zero mean random field with covariance parameters 6\. 
Further assume that both z(s) and the parameter vector (3 = (j3\, . . . ,/3fc) T are a priori Gaussian. 

Let y = (yi,...,y m ) T be a vector of measurements of the latent field with some distribution 
7r(y|x b s , 62), where x b s is a vector containing the latent field evaluated at the measurement locations 
and 62 is a vector of parameters for the measurement distribution. Finally let si, . . . , s n be the set of 
locations where predictions of the latent field should be calculated and let x = (x(si), . . . ,x(s n )) T . 
The posterior distribution for x can then be written as 

7r(x|y) = J ^(x|y,0)7r(6>|y)d0, (2) 

where = (Oj , #2~) T , an d this is the distribution that should be used in the probability calculations 
when estimating the excursion sets. 

There are now, in principle, two main problems that have to be solved in order to find the excursion 
sets, level avoidance sets, or contour uncertainty sets: 

integration For excursion sets, calculate P(D C A^(x)) or P(D C A~(x)) for a given set D, or in 
the case of level avoidance sets or uncertainty regions for contour curves, calculate P(D~ C 
A-(x),D + C A+(x)) for the pair of sets (D + ,D-). 

optimization Use shape optimization to find largest region D satisfying the required probability 
constraint. 

Hence, given a method to solve each of the two problems, one could simply run the shape optimization 
algorithm and in each iteration calculate the required probability using the integration method. In 
theory there are no problems doing this, but in practice the integration method will be computationally 
demanding and it may not be feasible to use this strategy for applications involving large data sets. 
Therefore, we instead propose a slightly different strategy that will minimize the number of calls 
to the integration method by solving the problem sequentially. We first outline the strategy in the 
simplest possible situation, which will be used as a basis for all other more complicated strategies. 

The method is based on using an increasing parametric family for the excursion sets in combination 
with a sequential integration routine for calculating the probabilities. The advantage with using a 
sequential integration routine is that if the required probability has been calculated for some set D\, 
then the calculation for a larger set D2 D D\ can be based on the result for Di, resulting in large 
computational savings. 
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Algorithm 3.1 (Calculating excursion sets using a one-parameter family). Assume that the model 
parameters 6 are known and that the posterior distribution 7r(xjy,#) is Gaussian. Further assume 
that D(p) is a parametric family for the possible excursion sets, such that D(p\) C D{p2) if p\ < P2- 
The following strategy is then used to calculate E^ a . 

1. Choose a suitable (sequential) integration method for the problem. 

2. Reorder the nodes to the order they will be added to the excursion set when the parameter p is 
increased. 

3. sequentially add nodes to the set D according to the ordering given above and in each step update 
the probability P{D C A+(x)). Stop as soon as this probability falls below 1 — a. 

4- E^ a is given by the last set D for which P(DC (x)) > 1 — a. 

The computational savings of this sequential strategy are large. For example, assume that we 
want to find the positive level u excursion set E+ a (x), and have m candidates D(pi), . . . , D{p m ) to 
choose from. Using the naive optimization method, we would then have to check whether P(D(pi) C 
A£ ( x )) > 1 — a for each of these sets, and among the sets that satisfy the condition select the largest. 
Thus doing the probability calculation m times. However, by reordering the nodes and adding them 
sequentially we only have to run the integration routine once. Also note that the excursion function 
F£(s) is retrieved by setting a = 1 and in each step saving the probabilities P(DC A+(x)). Thus, 
the entire excursion function can be retrieved by running the algorithm once. 

Before extending this method to more general situations, we go into more detail on how to do the 
steps in Algorithm 13. II in practice. In Section \3.1\ a few sequential integration methods are presented. 
In Section 13.21 some different parametric families for the excursion sets and level avoidance sets are 
introduced and Algorithm 13.11 is extended using two-parameter families. The problem of how to 
optimally reorder the nodes is also discussed in this section. Finally in Section 13. 3| three different 
methods are proposed for calculating excursion sets under the full posterior distribution ([2]). 

3.1 Gaussian probability calculations 

For a Gaussian vector x, the probabilities P(D C A+(x)), P{D C A~(x)), and P(D + C A+(x),£>~ C 
A~(x)) can all be written on the form 

'(».b.S) = p^ F , / a ^ b ex P (-lx^-'x)dx, (3) 

where a and b are vectors depending on the mean value of x, the domain D, and on u. There have 
been considerable research efforts devoted to approximating integrals of this form in recent years, and 
we will in this section briefly describe a few techniques that can be used. 

The simplest way of approximating ([3]) is to use Monte-Carlo (MC) integration. However, es- 
timating the probability with any reasonable accuracy using standard MC integration is often too 
computationally expensive. Fortunately there are a number of variance reduction techniques that can 
be used to increase the efficiency 

A key step in many numerica l integration techniques is to transform the integral to make it more 
suitable for integration. Notably, iGenz |l992l | derived such a transformation for the Gaussia n integral 



O), t hough similar transformations have been suggested by other authors as well [see e.g. iGewekd . 
|l991|. Besides transforming the integral to the unit hyper cube, the transformation also achieves a 
separation of the variables so that the full problem can be calculated sequentially. The integral can 
then efficiently be evaluated using a quasi MC (QMC) method where the uniform random numbers 
in the ordinary MC integrator are replaced by some determini stic sequence of points chosen to reduce 



the probabilistic error bound of the crude MC integrator, see lGenz and Breta [2009l | for details. 
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A final variance reduction technique for the general int egration problem is to reorder the vari- 
ables before c alcula ting the integral, as first suggested by Schervishl |l984| and later improved by 



Gibson et al 



Il994| . The se reorderings can reduce the error by an order of magnitude, as shown 
by Genz and Bretz (20021 ] . However, the technique will not be applicable in our situation since the 



reordering will be determined by a parametric family for the excursion sets. 



3.1.1 Methods for Markov random fields 

A common assumption in spatial statics and image analysis is that the latent field can be m odeled, 
or approximated, using a Gaussian Markov random field (GMRF). See Rue and Held |2005l | for an 
introduction to GMRFs, and note th at GMRFs are used a lso for modeling in continuous space, for 
example using the SPDE approach by Lindgren et al. 201 1| . One of the motivating reasons for using 
GMRFs is that it reduces the computational cost for parameter estimation and spatial prediction, 
and because of this one would also like to be able to use the Markov property in the calculation of 

The main difference between latent GMRF models and standard Gaussian models is that the 
distribution is specified using the (sparse) precision matrix Q instead of the covariance matrix: 



J(a,b,Q) 



IQI 1 / 2 
(2vr) d /2 



1 



exp(— -x Qx) dx, 

a<x<b 2 



(4) 



Using the QMC methods on GMRF models is difficult without first inverting the precision matrix 
and then ignoring the sparsity of Q in the calculations. To take advantage of the sparsity of Q one 
can use the fact that any GMRF c an be viewed as a non -homogeneous auto-regressive process defined 
backwards in the indices of x [see iRue and Heidi . 120051 . Theorem 2.7], that is, if x is a GMRF with 
mean /x and precision matrix Q, then 



,X n ~ 



/ 1 n 

N ^-j- £ L a 

\ j=i+l 



a(xj — fij), L 



-2 



(5) 



where Lj,- are the elements of the Cholesky factor of Q. The integral can thus be written as 



/(a,b,Q) 



l>2 



7r(Xl|x 2: d) / Vr(x2|x 3:d ) 



(12 



b d - 



ad-\ 



n(x d -x\x d ) I ir(x d )dx 

o-d 



where, b ecctuse of the Markov structure, x^lxj^.^-^ only depends on the elements in Xj\f. n { i+ i :d }, and 
Mi is the neighborhood of i in the graph of the GMRF. 

If Q is a band-matrix, the integ ral can be efficiently calcu lated as a sequence of iterated one- 
dimensional integrals as discussed in iGenz and Kahanerl 1 19861 ] . However, the band width of L will 
often be too large for this method to be efficient, and a better alterna t ive is then of use a particle filter 
algorithm based on the GHK simulator Geweke, 1991 . Haiivassiliou . 1991 . Keane, 19931 ] . Denote the 
integral of the last d — i components by 2j , 



bi rbd-i 

ir(xi\x i+ i :d ) ■■■ Tr(x d _i\x d ) 



f-bd 

/ n(x d ) 

J ad 



dx, 



J o.d-1 J a d 

and note that the integral is the normalizing constant to the truncated density 

fi(-Xi; d ) oc l(ai-. d < x i:d < bj :d )7r(xj. d ). 

The integrals I d , . . . , I\ are now estimated sequentially using importance sampling. In the first step, 
calculate I d = <&(Ln(b d — fi d )) — Q(Ln(a d — fi d )), simulate N samples {x^jL-^ from the truncated 
normal distribution h d (x d ) oc l(a d < x d < b d )ir(x d ) 1 and set w d = I d . Next, simulate x d _ 1 from 
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= l{a d _i < < 6 d _i)vr(x d _i|x^) and set x^_ 1;d = {x 3 d __ v x 3 d } . The integral I d _i is 
estimated as I d —i ~ 77 SfLi wnere ^d-i are the importance weights 

' /_L ^-i(xd_i|x d )/i d (x d ) 



j < x rf _i :rf < b rf _i :rf )7r(x rf _i :d ) 



(6) 



Proceed like this, simulating from the truncated conditional distributions and in each step updating 
the importance weights recursively through 
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$ j Lu(bi - m)+ ^2 L ji( x j - Mi) - * I - Mi) + ^ ^ii(^i - Mi) 

i=*+i / V i=i+i 



w l+v 



To reduce the variance of the estimator when the target probability is small, a resampling step can 
be performed after having calculated the weights w\. This is a common strategy in particle filter 
techniques, and the sample {x^. d } is then updated by selecting N particles from the set, where x\. d 
is selected with probability Wif^2ks=l w i' To avoid resampling too often, one can do the resampling 
only if some criterion is met, for example if the effective sample size is below some given threshold. 

3.2 Parametric families 

In theory one can use any shape optimization technique to find the largest region D. However, since 
evaluating the probability P(x(s) > u, s € D) for a given set D is computationally expensive, one 
would like to do as few iterations as possible in this step. As discussed previously, we will solve 
this by assuming a parametric form of the sets D. The optimization can then be reduced to a 
standard optimization of only a few variables instead of doing a full shape optimization procedure. 
The parametric families will be based on the marginal quantiles of x(s), 

P(x(s) < q p (s)) = p, 

which are easy to calculate using only the marginal posterior distributions. The simplest one- 
parameter family based on the marginal quantiles is given in the following definition. 

Definition 3.2 (One-parameter family). Let q p (s) be the marginal quantiles for x(s), then a one- 
parameter family for the positive and negative u excursion sets is given by 

D+(p) = {s; P(x(s) > u) > 1 - p} = {s; P(x(s) < u) < p} = A+(q p ), 
D^(p) = {s; P(x(s) < u) > 1 - p} = {s; P(x(s) > u) < p} = A'^p). 

Using this one-parameter family in Algorithm 13. H is equivalent to finding a threshold value for the 
marginal excursion probabilities to get the c orrect simultaneous significan ce level. It is thus similar 
to the thresholding algorithms discussed in Marchini and Presanis 2003] but with the important 



difference that the correct joint, often non-stationary, posterior density is used when finding the 
threshold. 

The simple one-parameter family can be extended in a number of ways, for example by considering 
other levels in the excursion sets. 

Definition 3.3 (Two-parameter family). Let q p (s) be the marginal quantiles for x(s), then a two- 
parameter family for the positive and negative u excursion sets is given by 

D+{v, p) = {s; P(x(s) > v) > 1 - p} = {s; P(x(s) < v) < p} = A+(q p ), 
D^(v,p) = {s; P(x(s) < v) > 1 - p} = {s; P(x(s) > v) < p} = A-(q^ p ). 

The sets Df(v,p) and D^[(v,p) are increasing in p for a fixed v. 
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One drawback with this parametric family is that it does not take the spatial dependency of the 
data into account directly. Therefore certain sets which might seem reasonable to test are not included 
in the family. Consider the example in Section 14.11 Figure Q] Panel (a) , where the marginal excursion 
probabilities are shown in grey for an example in one dimension where the model is a Gaussian 
process with exponential covariance function. The estimated posterior mean in this example is shown 
as the black curve in Panel (b) in the figure, and in this situation a reasonable candidate for the 
O-excursion set might be a contiguous set centered at 1, [1 — Ax, 1 + A2] for some positive Ai,A2- 
However, looking at the marginal probabilities we see that sets on this form will not be included 
in the parametric family. One way of including such sets is to first smooth the marginal excursion 
probabilities = P(x(sj) > u) using some parametric smoother and then consider sets on the form 
{s;pj > 1 — p} where pj are the smoothed positive excursion probabilities. 

Definition 3.4 (Two-parameter smoothing family). Let pj be the smoothed marginal positive u 
excursion probabilities, using a circular averaging filter with radius r. A two-parameter family for 
the positive and negative u excursion sets is then given by 

D+(T,p) = {s; P r>l-p}, 

D 2 (r,p) = {s;pj < p}. 

The parameter r determines how close p[ is to the original excursion probabilities. For r = 0, 
no smoothing is done and for a general r, p\ is equal to the average of the marginal excursion 
probabilities in the disk with radius r centered at Sj. As r increases pj becomes smoother and 
approaches a constant function equal to the average excursion probability. One could also use other 
types of parametric smoothers instead of the simple averaging filter. 

Using the two-parameter families requires a modification to Algorithm 13.51 resulting in a slightly 
more computationally demanding method. 

Algorithm 3.5 (Calculating excursion sets using a two-parameter family). Assume that the model 
parameters 6 are known and that the posterior distribution 7r(x|y, 0) is Gaussian. Further assume 
that D{v,p) is a parametric family for the possible excursion sets, such that D(v,pi) C Diy^p^) if 
Pi < P2 f or a fixed v . The following strategy is then used to calculate E^ a . 

1. Choose a suitable (sequential) integration method for the problem. 

2. Select a suitable one- dimensional optimization strategy. 

3. Do optimization of the size of D(v,») over u: 

• For the current value of v , reorder the nodes to the order they will be added to the excursion 
set when the parameter p is increased. 

• sequentially add nodes to the set D according to the ordering given above and in each step 
update the probability P{D C A„ (x)). Stop as soon as this probability falls below 1 — a. 

• return the last set D for which P(OC (x)) > 1 — a. 

4- E^ a is given by the largest set D found in the optimization over v . 

The optimization can in this case be done using a Golden section search or a similar fast optimiza- 
tion procedure for one-dimensional problems. Algorithm 13.51 can also be used to estimate uncertainty 
regions for contour curves by using the following two-parameter family for the pair of level avoiding 
sets. 

Definition 3.6 (Parametric family for level avoiding sets). Let Df(pi) and D^{p2) be given by Defi- 
nition [321 A two-parameter family for the pair of level avoiding sets is obtained as (Df(pi), D^(p2)). 
A one-parameter family is obtained by requiring that p\ = P2 = P- 

The one-parameter family in Definition 13.61 can be used in Algorithm 13 . 1 1 to estimate level avoiding 
sets and uncertainty regions for contour curves without having to use the more computationally 
expensive Algorithm 13.51 
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3.2.1 Domain bounds and reorderings 

In the case of a GMRF posterior, it is desirable to make the Cholesky factor of the precision matrix as 
sparse as possible, because it reduces the number of floating point calculations that have to be done 
and reduces the error of the estimator. Reordering the nodes according to a parametric family does 
not guarantee good sparsity of the Cholesky factor, but the reordering can be improved by finding 
upper and lower bounds for the region. 

The simplest upper bound for the region is to use 

U 1 = {s: P(x(s) > u) > 1 - a}, 

which is calculated using only the marginal probabilities, and which is the largest region D if x(s) is a 
perfectly correlated field. The domain D cannot contain any locations s which are not in U\ because 
all points not in U\ have marginal probabilities lower than 1 — a of exceeding the level u. 
A simple lower bound for the region is obtained using Boole's inequality as 

Li = {s : P(x(s) > u) > 1 - a/n} 

where n is the number of points in the discretization of the domain. In terms of multiple hypothesis 
testing, this lower bound is obtained from the classical Bonferroni correct i on m ethod and an improved 



lower bound can be obtained using the Holm-Bonferroni method [Holml . Il979l | as 

L 2 = {s : P( fc ) > 1 - a/k} 

where is the kth largest probability in the set {P(x(sj) > u),i = 1, . . . , n}. If the stochastic 
variables x(sj) are independent, L2 is the largest domain D. If the variables are not independent or 
perfectly correlated, one has L2 C D C U\. 

The nodes can now be categorized into three classes, the first class contains the nodes included in 
the lower bound L2 , the second class contains the nodes in the set Ui \ L2 and the third class contains 
all other nodes. Since one knows that all nodes in L2 will be included in D, these can be reordered 
to maximize the sparsity of the Cholesky factor, for example using an approximate minimum degree 
permutation. The nodes in the second class are then added in the order determined by the parametric 
family. Finally, since the nodes in the third class will not be included in the domain, these can be 
reordered to maximize the sparsity or integrated out of the posterior distribution. Making the bounds 
more precise will improve the sparsity of the problem and therefore reduce the Monte-Carlo error and 
the computational complexity. 

3.3 Probability calculations for the latent Gaussian setting 

In practice, we cannot use the computations from the previous sections directly unless we are in a 
purely Gaussian setting with known parameters. In the latent Gaussian setting with posterior 
the method has to be m odified. Since th is is a latent Gaussian setting, Integrated Nested Laplace 



Approximations (INLA) [Rue et all [2009] is a good alternative for estimating the posterior distribu- 



tions 7r(x|y, 6) and 7r(#|y); however, the choice of method to use for estimating these distributions 
does not affect the excursion set estimation so any appropriate method can be used to estimate these 
distributions. Given estimates of the posterior distributions, we propose three methods for calculating 
the excursion probabilities, assuming that 7r(x|y,#) is Gaussian: 

EB: (Empirical Bayes) Ignore the parameter uncertainty and calculate the probability conditionally 
on a parameter estimate. That is, estimate the excursion sets under the conditional poste- 
rior 7r(x|y,#o) where Oq for example is the maximum a posteriori estimate or the maximum 
likelihood estimate of 0. 
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QC: (Quantile Correction) Do a correction to the Gaussian probability calculations based only on 
the marginal posteriors in the following way. For each i use the marginal posterior to calculate 
P(xi > aj|y) and P(xj < 6j|y) and calculate a, and hi so that P(^ > S»|y, 0q) = P(xj > a«|y) and 
P(zj < bi\y, Go) = P(xi < bi\y). An estimate of the probability is then given by I(a, b, Q(#o)), 
where Q(#o) is the posterior precision matrix for x given the estimated parameters. 

NI: (Numerical Integration) Numerically approximate the excursion probability by approximating 
the integral in ([2]) as 



P(a < x < b|y) = E(P(a < x < b|y, 6*)) « ^ Wi P(a < x < b|y, 0;) 

i=l 

where the configuration of the po ints Qj in the hy per parameter space can, for example, be 
chosen as in the INLA method [see I Rue et all l2009l | and the weights Wi are chosen proportional 
to 7r(0j|y). 

The EB method is the simplest, and may be sufficient in many situations. The QC method is 
based on correcting the limits of the integral so that the probability would be correct if the Xj's were 
independent. This method is as easy to implement as the EB method and should perform better in 
most scenarios. Finally, the NI method is k times more computationally demanding as the probability 
has to be calculated for each parameter configuration 6i, but should also be the most exact method. 
If the number of parameters is small one can often obtain accurate results with only a few parameter 
configurations, but the accuracy of the estimator will depend on how these configurations are chosen. 

A second modification is required if the conditional posterior 7r(x|y,#o) is n °t Gaussian. The 
simplest solution to this problem is to do a Gaussian approximation TTg(x\y, 6ci), for example using 
Laplace approximations or simplified Laplace approximations as suggested by I Rue et al.l 2009] . If a 
Gaussian approximation is not sufficient, the sequential integration m ethod has to be modifi ed, and 
how to do this will depend on the posterior distribution. For example, iGenz and Breta [2009i | outline 
how the quasi Monte Carlo methods can be extended to ^distributions, and the GHK-based particle 
filter method can be extended to other types of distributions as well. However, it is outside the scope 
of this article to go into details on how to handle the non-Gaussian cases and we therefore leave this 
for future research. 



4 Tests on simulated data 

In this section, three examples using simulated data are presented to illustrate the methods and test 
their accuracy. In the first example, we look at a problem in one dimension with known model pa- 
rameters, where a latent Gaussian process with an exponential covariance is observed under Gaussian 
measurement noise. In the second example, we compare the different parametric families for contour 
uncertainty sets for a model in two dimensions with known parameters, where a latent Gaussian 
Matern field is observed under Gaussian measurement noise. In the third example, the same spatial 
model setup is used, but this time the model parameters are estimated from data and the three 
methods for handling the full posterior distribution are compared. 



4.1 Example 1: Id Gaussian data with known parameters 

We begin with a simple one-dimensional example to illustrate the different sets we have previously 
defined. Let x(s), s 6 [0,2] be a Gaussian process with an exponential covariance function with 
scaling parameter A = 1 and mean 



fj,(s) 



s-0.5 if s < 1 
1.5 -s if s > 1. 
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Figure 1: Results from Example 1. Panel (a) shows the excursion function F (s) (red) and the 
marginal excursion probabilities p(s) = P(x(s) > 0) (grey). Panel (b) shows Eq 0Q5 (x) in red, 
obtained as Aq 95 (Fq). The grey area shows Aq 95 (p), which is the upper bound U\, and the dark red 
set is the lower bound L<i- The black curve is the kriging estimate of x(s). 

We generate a trajectory from the model and observe it at 500 locations s\, . . . , s n drawn at random 
in the interval under Gaussian measurement noise, giving us observations yi = x(si) + £j where 
£i ~ N(0, a 2 ). We do spatial prediction (kriging) to 1000 equally spaced locations in the interval 
given the parameters 6 and the measurements y, and then estimate the positive 0-excursion function 
Fq(s) using the parametric family Df(0,p). In Figure [Tj Panel (a), Fq~(s) is shown in red together 
with the marginal excursion probabilities P(x(s) > 0) in grey. 

By the definition of Fq~(s), the positive 0-excursion set Eq (x), is obtained by calculating the 
1 — a excursion set of the function F + (s), and this set is shown for a = 0.05 in red in Figure [TJ Panel 
(b). The grey set shows the upper bound U\, which is the set where P(x(s) > 0) > 1 — a, and the 
dark red set shows the Holm-Bonferroni lower bound L2. The black curve shows the kriging estimate 
of the process given the data. Note that the grey and red sets are obtained as excursion sets of the 
grey and red functions in Panel (a), and also note that L2 C Eq (x) C U\. 

We now want to verify that the estimated sets Eq (x) have the correct excursion probability, 
that is, that P(x(s) > 0, s € Eq (x)) = 1 — a. To that end, draw N samples, x\(s), . . . ,xn(s) from 
7r(x|y, 0), count the number of samples for which inf{x(s), s € Eq a (x)} > 0, and denote this number 
by N s . Further let p(a) denote the proportion of samples, N s /N, that satisfies the requirement. If 
EQ a (x) is correctly estimated, p(a) should be close to 1 — a. In Figure [2] Panel (a), the difference 
1 — a — p{oi) is shown as a function of 1 — a. The difference is calculated twice, using two different 
estimates p(a), each based on N = 50000 samples. As can be seen in the figure, the difference is very 
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Figure 2: Results from Example 1. Panel (a) shows two estimates of 1 — a — p(a) as a function of 
1 — a. p(a) is an estimate of P(x(s) >0,s£ EQ a (x)) based on Monte-Carlo simulation of x(s), which 
should be close to 1 — a if E^ a {x) is correctly estimated. The two curves show two results for two 
different Monte-Carlo simulations. Panel (b) shows the estimated contour uncertainty set Mq 005 (x). 

small for all values of a, and the difference that can be seen is mostly due to the Monte-Carlo error 
in the estimation of p(a), which has nothing to do with the accuracy of the method. Thus the sets 
E^ a indeed have the correct excursion probabilities. 

Finally in Figure [2j Panel (b), the 0-contour uncertainty region Mq 05 (x) is shown in red and 
the kriging estimate of x(s) is again shown in black. The set was estimated using the two-parameter 
family for level avoidance sets from Definition 13.61 and Algorithm 13.51 The complement of this set is 
the union of the level avoiding sets (Mq Q 05 (x),Mq 05 (x)), which is the largest pair of sets (D + , D~) 
satisfying P(D~ C A~(x), D+ C A+(xj) > 0.95. 



4.2 Example 2: 2d Gaussian data with known parameters 

In this example, we change to a spatial model to test the parametric families for contour sets. Let 
x(s), s E [0, 10] x [0, 10], be a Gaussian field with a constant mean p = and a Matern covariance 
function 

c(||h||) = - /~ P(t>2 , - {k\\H) v km\H\ (7) 

where v is a shape parameter, n 2 a scale parameter, (f) 2 a variance parameter, K v is a modified Bessel 
function of the second kin d of order v > , and || • || denotes the Euclidean spatial distance. We use the 



SPDE representation by iLindgren et al.l [20111 ] of the field using a triangulation based on an 80 x 80 
regular lattice in the region. The representation is a piecewise linear approximation x(s) ~ ^ i Xi<fi(s) 
of the field using 6400 piecewise linear functions (pi(s), each centered at one of the nodes in the lattice. 
The advantage with this representation is that it allows us to do all calculations using the weights x 
of the basis expansion, which form a Gaussian Markov random field. 

We set v = (f) = 1, and k 2 = 0.5, and generate a sample of the field and measure it at 1000 
locations in the square, chosen at random, under Gaussian measurement noise, giving us observations 
Hi = x(si) + £j where £j ~ N(0, a 2 ) and a = 0.1. The posterior estimate (kriging) of x|y can be seen 
in Figure [3l Panel (a), and the uncertainty region Mg Q 5 (rc) for the 0-contour can be seen in Panel 
(b). In this case the Mq 05 (x) was estimated using the one-parameter family in Definition 13.61 and 
it is now of interest to test how much is gained by using the two-parameter family from the same 
definition instead. 

To that end, we generate 50 data sets using the same setup, and for each data set estimate 
Mqqq 5 (x), first using the one-parameter family (Df (p) , (p)) , and then using the more general 
two-parameter family (Df(pi),D^(p2))- Since the one-parameter family is a special case of the 
two-parameter family where p\= pi = P, the contour sets estimated with the two-parameter family 
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Figure 3: Results from Example 2. Panel (a) shows a kriging estimate and Panel (b) shows the same 
estimate where the corresponding estimated contour uncertanty set Mq 005 (x) is superimposed in 
grey. 

should always be smaller than the one-parameter sets. However, using the two-parameter family, the 
estimated sets are on average only 0.2% smaller than if the one-parameter family is used, so in this 
case it is arguably not worth the extra computational effort to use the two-parameter family, although 
for other levels u, or other latent models, the difference might be larger. 

4.3 Example 3: 2d Gaussian data with unknown parameters 

In this example we compare the three methods, described in Section l3~3l for handling the full posterior 
distribution (|2|) in the calculations. The same Gaussian Matern model is used as in Example 2, with 
the difference that we now also estimate the parameters from the data. 

We set v = (J) = 1, and k 2 = 2 in the covariance function (|7|), and generate a sample of the field 
and measure it at 1000 locations in the square, chosen at random, under Gaussian measurement noise, 
giving us observations yi = x(sj) + £j where E{ ~ N(0, a 2 ) and a = 0.5. Given the measurements, we 
estimate the parameters and the marginal posterior distributions using INLA. The posterior estimate 
(kriging) of x|y can be seen in the lower right panel of Figure [H and in the lower left panel, the 
marginal probabilities P(x(s) > 0|y) are shown. 

We now estimate the excursion function F^(s) using the three different methods described in 
Section 13.31 and the one-parameter family from Definition 13.21 for the excursion sets. These can be 
seen in the upper panels of Figured Visually it is in this case difficult to see any differences between 
the three estimates of the excursion function. To compare the accuracy of the estimates we will do a 
simular comparison to the one performed in Example 1, where Monte-Carlo simulation was used to 
estimate p(a), the proportion of samples satisfying inf{x(s),s 6 EQ a (x)} > 0, which should be close 
to 1-aif E+ a (x) is correct. 

There are three possible sources of errors in this comparison. The first one is the Monte-Carlo 
error from the estimation of p(ct), which has nothing to do with the accuracy of the method. The 
second error is the Monte-Carlo error in the probability estimation when estimating the excursion 
distribution functions. This error is, however many orders of magnitude smaller in this case. The 
final error is the approximation error induced by using any of the three methods EB, QC, or NI for 
handling the full posterior distribution. 

To investigate this approximation error, the difference 1 — a — p(a) is estimated for the three 
estimates of Fq(s). First we base the estimate on 20000 samples from the posterior 7r(x|y), obtained 
using the MCMC sampler described in Appendix [A) In Figure [5j Panel (a), the results can be seen 
for the EB method (blue), the QC method (green), the NI method with k = 45 parameter settings 
(red), and the NI method with k = 15 parameter settings (cyan). The comparison was done twice, 
with two different samples of size 20000 when calculating p(a), and the curves of the same color show 
these two and give an indication of the size of the Monte-Carlo error in the comparison. As seen in 
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Figure 4: Results from Example 3. In the top row, three estimates of the excursion function can be 
seen using the EB method (left), the QC method (middle), and the NI method with 15 parameter 
configurations (right). In the bottom row, the marginal p-values for exceeding the limit can be seen in 
the left panel, using the same color scale as for the top row. The middle panel shows the set Eq Q 05 (x) 
given by excursion function estimated by the NI method. Finally the right panel shows the kriging 
estimate of the latent field. 



the figure, the NI method performs best, as expected. 

The error using the NI method comes from the fact that only finitely many points are used 
in the integration when approximating the posterior distribution for the parameters. That is, the 
full posterior ir(0\y) is approximated by a discrete distribution with point masses at the parameter 
configurations 6i used in the integration, ir(0i) = Wi. To verify that this indeed is the source of the 
error in, we construct a second Monte-Carlo sampler where we instead of sampling 6 from the full 
posterior 7r(0|y) sample it from the discrete distribution defined by the 45 parameter configurations 
used in the first NI method. Panel (b) in Figure [5] shows the same comparison as Panel (a) but 
where 6 is sampled from the discrete distribution. As expected, the error for the NI method with 45 
parameter configurations is now smaller. 

The Monte-Carlo error from estimating p(a) is quite large in Figure [5j so to get a better under- 
standing of the other errors, a larger study was also performed where the procedure in Figure [5] was 
repeated 50 times for 50 different simulated data sets, and for each data set iV = 60000 draws from 
the posterior was used when estimating p(a). The average errors of these 50 runs can be seen in 
Figure [6j In Panel (a) the results using samples from the full posterior is shown, and in Panel (b) the 
results using the discrete distribution for is shown. Note that the red curve is very close to zero 
in Panel (b), indicating that the error in the NI method mostly depends on choosing the integration 
points for 6 so that they capture the true posterior distribution well. Also note that the QC method 
performs well for large values of 1 — a, and since one most often is interested in finding the excursion 
sets for small values of a, this method is then a good way of finding such sets with less computational 
effort than using the NI method. 
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Figure 5: Results from Example 3 showing the difference 1 — a — p{a) as a function oil — a for the 
different approximation methods, EB (blue). QC (green), NI using 45 parameter configurations (red), 
and NI using 15 parameter configurations (cyan). p(a) is an estimate of P(x(s) > 0,s 6 EQ a (x)) 
based on MCMC simulation of x(s), which should be close to 1 — a if Eq (x) is correctly estimated. In 
Panel (a), p(a) is estimated using the full posterior distribution, and in Panel (b), p(a) is estimated 
using the discrete posterior distribution defined by the 45 parameter configurations from the NI 
method. The comparison was done twice, with the two different estimates of p(a), each based on 
20000 samples of x(s), and the curves of the same color shows these two. 



5 Applications 

In this section, we will use the techniques described above in two different applications. In the first, we 
study air pollution data from Piemonte region in northern Italy and estimate regions where the daily 
limit for PMio (particulate matter with an aerodynamic diameter of less than 10 //m) is exceeded. 
In the second application, we study vegetation index data from the African Sahel and estimate areas 
that experienced a significant increase in vegetation after the drought period in the early 1980's. In 
both of these applications the data sets are large, and the Markov structure of the latent Gaussian 
models has to be used in the calculations. 



5.1 Air pollution data 



High levels of air pollution can be harmful for the ecosystems and the human health. The effects 
on human health ranges from minor effect s to the cardio-respiratory system to premature mortality 



Cohen et all I2009I . ICameletti et all . |2012| . Because of this, environmental agencies have to assess 



the air quality in order to take proper actions for improving the situation in polluted areas, and an 
important tool in this process is the ability to produce continuous maps of air pollution. 

A region where the daily limit values fixed by the European Union for human health protection (see 
EU Coun cil Directive 1999/30/E C) are periodically exceeded is the Piemonte region in northern Italy. 
Recently, Cameletti et al. 2012| proposed a statistical model to capture the complex spatio-temporal 
dynamics of PMio concentration in the region and used it to produce daily maps of PMio. They also 
produced daily maps of exceedance probabilities of the value BOfig/m 3 , which is the value fixed by the 
European directive 2008/50/EC for the daily mean concentration that cannot be exceeded more than 
35 days in a year. These probability maps only considered the marginal excursion probabilities, and 
no attempts of producing maps of simultaneous exceedance probabilities were made. In the following, 
we will therefore consider the same model and data but also estimate the excursion functions for the 
5 nq/m? limit va lue . 



Cameletti et al.1 [2012l | considers daily PMio data measured at 24 monitoring stations by the 
Piemonte monitoring network during 182 days in the period October 2005 - March 2006, the data is 
provided by Aria Web Regione Piemonte. Denoting the measurements made at location Sj at time t 
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Figure 6: Results from Example 3 showing the difference 1 — a — p(a) as a function oil — a for the 
different approximation methods, EB (blue). QC (green), NI using 45 parameter configurations (red), 
and NI using 15 parameter configurations (cyan). The same setup as in Figure [5] was repeated 50 
times for 50 different datasets and 60000 samples were used when estimating p(a). This figure shows 
the average error of these 50 runs. 



by y(si,t), the following measurement equation is assumed, 

y(si,t) = x(si,t) + e(si,t), 



(8) 



where e(si,t) ~ N(0,o"^) is Gaussian measurement noise, both spatially and temporally uncorrelated, 
and x(si,t) is the latent field of true unobserved air pollutions. The latent field is assumed to be on 
the form 



:(si,t) = ^2 Zk(8i,t)Pk + €(si,t), 



(9) 



fc=i 



where the p = 9 covariates Zh are u sed and £ is a spatio-temporal Gaussian random field. Based 
on the work of Cameletti et al. 201 1| the following covariates were used: 1) Daily mean wind speed; 
2) daily maximum mixing height; 3) daily precipitation; 4) daily mean temperature; 5) daily emissions; 
6) altitude; 7) longitude; 8) latitude; and 9) intercept. These covariates are provided with hourly 
temporal resolution o n a 4 km x 4 km reg ular grid by the environmental agency of Piemonte region 

The spatio-temporal process £ is assumed to follow first 



( Arpa Piemonte) , see iFinardi et al. 



order autoregressive dynamics in time with spatially dependent innovations: 

£(si,t) = a£(s i; i- 1) +u(si,t), 



(10) 



where \a\ < 1 and w(sj,i) is a zero- mean temporally independent Gaussian process characterized by 
the spatio-temporal covariance function 



Cov(u(s i ,ti),u;(s j ,t2)) 




if h / t 2 
otherwise, 



(11) 



where C(-) is a Matern covariance function given by ([7]). The model parameters and the poste- 
rior distribution for the latent field ar e the n estimated using INL A in combination with the SPDE 



representation of iLindgren et al.l [201 lj , see 



Cameletti etHI |2012l | for details. 
The map of marginal excursion probabilities for the level 50 fig /m 3 for January 30, 2006, based 
on the estimated posterior distribution for x, can be seen in the left panel of Figure [71 To avoid 
inappropriate linear extrapolation of the effect of elevation beyond the range of the elevation of the 
observations, the results are only shown for areas below 1km. Based on these results, we now estimate 
the positive excursion function for the level 50/ug/m 3 , -F^(s), using the NI method from Section 
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Figure 7: Results from the PMio application for January 30, 2006. A map of the marginal exceedance 
probabilities for 50 fig /to 3 (left), and the joint excursion distribution function for the level (right). 

and the parametric family of excursion sets from Definition [3]2j A total of 25 parameter configurations 
are used in the integration. The result can be seen in the right panel of Figure [71 As seen in the figure, 
there are three regions where the level is clearly exceeded, and a fourth that possibly contains too high 
pollution levels. As expected, these areas coincide with the locations of the main metropolitan areas 
in the region; Turin, Novara, Verceili, and Alessandria. In this case, it would have been desirable to 
make the predictions on a finer spatial scale, but since the covariates were given on a 4 km x 4 km 
grid, this spatial resolution had to be used in the prediction. 

To get a better understanding of the results, it is also of interest to find the regions where the 
pollution level is simultaneously below the limit value with some given probability. The marginal 
probabilities for being below be the level 50 fig /m 3 , based on the estimated posterior distribution for 
x, can be seen in the left panel of Figure [HJ The results are again only shown for areas below 1km 
altitude. Using the same method as for the positive excursion function, we now estimate the negative 
excursion function for the level 50 fig /m 3 , F^q^s). The result can be seen in the right panel of Figure 

B 

Note that the union of E^ x (x) and E 50 x (x) covers only a small part of the region, indicating 
that the uncertainty in the problem is large. See the red and blue sets in the left panel of Figure [9j 
Also, by taking the complement of the set E^ Q i(x), we get the region that contains all exceedances of 
the level with certainty 0.9, indicated in grey in the left panel of Figure [9j This set is large, indicating 
that there are many regions where the level possibly is exceeded. Hence, it is important to note that 
the positive excursion set E^ 1 (x) is small because the uncertainty is large in the problem, and not 
because the other regions certainly have concentrations below the level. 

To verify that the uncertainty is large, we finally calculate the contour function for the level 
50/ro/m. 3 , F£ (s), using the NI method and the one-parameter family from Definition 13.61 for the pair 
of level avoiding sets. The result can be seen in the right panel of Figure [9j and the uncertainty region 
for the contour curve indeed covers a large part of the region which indicates that the uncertainty in 
the estimated contour curve is large. 
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Figure 8: Results from the PMio application for January 30, 2006. A map of the marginal probabilities 
for the field being below the level 50/ug/m 3 (left), and the joint negative excursion distribution function 
for the level (right). 



5.2 Spatially dependent temporal trends in vegetation data 

Trends in vegetation cover are related to changes in climatic drivers, feedback mechanisms between the 
atmosphere and land surface, and human interaction. A region with rapid recent changes is the African 
Sahe l . This zone has received much attention r egarding desertification and climatic variations Olsson , 
199,j iNicholsonl . l2000l . lLambl . Il982| . Recently, lEklundh and Olssonl |2003| observed a strong increase 
in seasonal vegetation index over parts of the Sahel using Advanced Very High Reso lution Radiometer 
(AVHRR) data from the NOAA /NASA Pathfinder AVHRR Land (PAL) database (Agbn and .Tamed . 
19941 . iJames and Kalluril . [1994J, for the period 1982-1999. The study was based on ordinary least 
squares linear regression on individual ti me series extracted for each pixel in the satellite images. 
The results of Eklundh and Olsson 2003| were later improved by Bolin et al. 2009 1 where a spatial 
model for the vegetation was used in the analysis to capture the spatial dependencies in the trend 
estimation. 

To find regio ns where changes in the ve getat ion have occurred over the course of the studied 



time period, both Eklund h and Olssonl |2003| and IBolin et al.l |2009l | used significance testing for the 



individual pixels in the field. Thus, pixels that individually had significant changes in vegetation were 
found, but no at tempts were made to find simultaneous excursion regions. Here, we will use a similar 
model to that of lBolin et al.l 2009] but also estimate joint excursion regions for the vegetation trends. 
Assume that the vegetation measurements year t are generated as, 

Y t \X t ,S et G N(A t X t ,£ e J, 

where is the latent vegetation field with prior distribution 7r(Xf), S et is a measurement noise 
covariance matrix, and Aj is an observation matrix determining which pixels in the field that are 
observed. To estimate time varying trends in the observations, X is restricted to follow a field of 
spatially varying linear trends: 

X t = K! + tK 2 (12) 

The prior distribution for K = [K7,K^] t is obtained by evaluating the joint distribution for X = 
[X^ 



1 > ■ 



, Xj] T conditionally on the restriction (11211 . We choose a second-order polynomial IGMRF 
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Excursion sets 



Contour function Fg (s) 




Figure 9: Results from the PMio application for January 30, 2006. In the left panel, the set E^ Q 1 (x) 
is shown in red, 1 (x) in blue, and it's complement 1 (x) c in grey. The contour curve for the 
level 50/j,g/m 3 is shown in green. The right panel shows the contour function for the level 50 fig /m 3 , 



Rue and Hell. 1200,4 S ection 3.4.2] prior for X and then calculate the corresponding prior distribution 



for K, [see iBolin et al I . I2Q09L for details]. 

To complete the model, the structure of 5] e needs to be determined. Many of the factors that the 
measurement noise should model are local phenomena, such as aerosol and cloud cover. Si nce it seems 



unrea sonable that the scale of these disturbances would be the same over the entire region, IBolin et al. 
[2009] assumed that the measurement noise was uncorrelated with a different noise variance at each 
pixel in the field. This results in a large number of parameters for the measurements noise, one for 
each pixel in the field, so here we instead use a different slightly simplified noise model. We divide the 
region into five different land cover categories using the Africa Land Cover Characteristics Data Base 
Version 2.0 (http://edc2.usgs.gov/glcc/glcc.php): 1) Bare desert; 2) Semi desert; 3) Savanna; 
4) Crops, grass, and shrubs; and 5) Forests and wetlands. The measurement noise variance at pixel 
s,- is then modeled as 



logcr 2 (sj) = J2 9 kh(si), 



(13) 



k=l 



where bk(s) is the spatial basis function with values equal to the proportion of vegetation type k at 
each pixel s. The parameters of the model are thus the scale parameter k and the five measurement 
noise parameters 6\, . . . ,65. A gamma prior is assumed for k and gaussian priors are used for k . 

We choose to study the western part of the Sahel region, and this area is divided into 35463 pixels 
of size 8 km x 8 km, so the field K has 70926 elements, and there are 547832 measurements from 17 
years of data starting in 1982 and ending in 1999. 

The model parameters and the marginal posterior distributions are estimated using the INLA 
framework and the excursion sets are estimated using the QC method from Section 13.31 The results 
can be seen in Figure [10] and Figure [11] The top panel in Figure [10] shows the posterior estimates 
of the intercepts, Ki, and the slopes, K2, is shown in the bottom panel. As expected, intercepts are 
larger in the savanna regions to the south, and smaller in the semi desert regions to the north. The 
top panel of Figure [TT1 shows the estimated standard deviation of the measurement noise using model 
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Figure 11: Results from the Sahel vegetation data. The top panel shows the estimated standard devi- 
ation of the measurement noise and the bottom panel shows the estimated excursion set Eq 05 (K2) 
in red and the point-wise positive significant trends in green. 
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(|13|) . It is worth noting that these results look reasonable, with larger measurement errors in the 
coastal region and where there are forests and wetlands, and smaller measurement errors in desert 
and semi desert regions. Finally the bottom panel of Figure [11] shows the estimated excursion set 
^0~0 05(^2) m red and the point-wise positive significant trends in green. The interpretation of the 
result is that one with high certainty can conclude that the areas indicated in re d have experienced an 



increa se in vegetation over the studied time period. Hence, conclusions drawn bv lEklundh and Olsson 



2003] seem valid, also when taking the spatial dependency of the vegetation into account and when 



estimating the excursion sets controlling the family- wise error. 

6 Discussion 

Estimating excursion sets and uncertainty regions for contour curves for stochastic fields are difficult 
problems, both because of computational issues but also because it might not be clear how such 
uncertainty regions should be defined. In this work, we have given precise definitions for these 
uncertainty regions, introduced the concept of excursion functions as a visual tool for illustrating 
the uncertainty in these regions, and presented a method for calculating these quantities for latent 
Gaussian models. 

The main idea behind the computational method is to use a parametric family for the excursion 
sets in combination with a sequential integration method to reduce the computational effort required 
when estimating the sets in practice. Tests on simulated data showed that the method is accurate, and 
two applications were presented to show that the method is applicable even to large environmental 
problems. 

There are a number of extensions that could be made to this work. First of all, using the one- 
parameter family for the excursion sets gives a method that falls into the broad category of p- value 
thresholding methods for estimating simultaneous excursion sets. As previously mentioned, the im- 
portant advantage with the method proposed here compared with other commonly used thresholding 
methods is that the correct joint distribution is used when selecting the threshold. The disadvantage is 
that the method is computationally more expensive than many of the standard thresholding methods. 
It would, therefore, be interesting to do a comparison with other similar methods with respect to the 
accuracy and computational complexity. Another interesting comparison wo uld be to compare the 



uncer tainty regions for contour curves produced by these methods to those of Linderen an d Rvchlik 



1995]. One could potentially also combine these methods with the work by Polfeldt 1999 ] to make 



statements on the quality of contour maps. 

We also presented other parametric families that can be used to obtain more complicated methods 
for estimating the excursion sets, with the possibility of finding more precise estimates under the cost 
of higher computational complexity. Initial comparisons showed that there is not much gain in using 
these more complicated methods, but so far these comparisons have only been made using fairly simple 
latent models, and the gain is likely higher when the latent models are more complex. Hence, more 
studies are required to verify if this is the case and to investigate in what situations it is appropriate to 
use the simple one-parameter families. One possible advantage with the more complicated parametric 
families is when one has prior knowledge regarding the shape of the excursion sets. For example, 
if one knows that the excursion sets should be large contiguous regions, such knowledge could be 
incorporated using a two-parameter smoothing family. 

As a final note, the method introduced here is available as a C-package with interfaces to both R 
and Matlab, see the supplementary material for the online version of the article for details. 
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A Notes on the MCMC algorithm used in Example 3 

To generate samples from the posterior distribution 7r(x|y) in Example 3 , an MCMC alg o rithm 
is used. The algorithm is a random- walk Metropolis Hastings algorithm Metropolis et al. . 19531 . 
Hastings! . Il97d | with proposal kernel 



g({x o id.0old}.{x,0}) = 7r ( x |y> )©(^old^)- 

A new proposal of the parameters 6 = (log(cr), log(re), log (</>)) is proposed based on the old value 
6?old using ~ N(0 o i(j, Hg). Here Sg is a scaled version of the Hessian m atrix evaluated at th e 



maximum posterior estimate of 6. The scaling is selected as suggested by iGelman et al.l [199f 
A new value for x is then proposed using the marginal posterior distribution 7r(x|y,<?) given by 

x|y, ~ N(4yQ A T y, Q). Here Q = Q + ^2-A T A, where Q is the precision matrix for x and 
A is an observation matrix determined by the measurement locations. The acceptance probability 
simplifies to 

7r(0 o ld|y) 



"MCMC = min ( !> ^(flly) 



where the posterior 7r(0|y) is given by 

^ |y) "T^i exp l^ y l~ — J y J' ( } 

Since the proposal for x does not affect the acceptance probability, a new proposal for x is only 
generated if 6 is accepted. With only three parameters in the model, we achieve good mixing this 
way, but being an MCMC-procedure it is still highly computationally demanding since the calculation 
of the acceptance probability requires a few Cholesky factorizations and back substitutions based on 
the posterior precision matrix for x. 
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